Background¶

Spotify invested over $1 billion into podcasts between 2018 and 2022. The strategy focused on acquiring and producing exclusive podcast content to boost its market presence.

  • "The Joe Rogan Experience": ~$100 million deal, with 11 million listeners per episode.
  • "Archetypes" by the Sussexes: $20 million contract, discontinued after a single season.
  • Partnership with the Obamas: estimated $30 million contract.

Despite these significant expenditures, financial returns have lagged, with Spotify reporting a net loss of €430 million in 2022 and ongoing deficits. As a result, the company has reevaluated its approach, streamlining operations and canceling underperforming shows.

Spotify now focuses on sustainable growth through content diversification and bolstering podcast advertising. The success of these strategic adjustments in the competitive and fluctuating digital ad space remains to be seen.

Introduction¶

The purpose of this analysis is to examine whether Spotify made the right choice in focusing on a small number of very expensive podcast deals, or whether other approaches might have made more sense in the long term.

Specifically, we'll look into:

  1. How successful was Spotify in growing its podcast listener base?
  2. Did major podcasts outpace others in attracting listeners, justifying Spotify's past investment focus?

We'll be using the Podcast Reviews dataset from Kaggle.

Popularity and top-percentile review counts over time¶

We can see that the share of reviews going to the top 5%, and especially the top 1%, of podcasts started increasing significantly after 2018. This implies that Spotify's strategy was successful both in attracting new listeners and in concentrating those listeners on a small number of the most popular podcasts, compared to earlier years.

We see a significant falloff after 2021, which probably has several explanations:

  • Media consumption in general decreased in the aftermath of the Covid pandemic.
  • We're using review count as a proxy for popularity, which is problematic: users can only leave a single review per podcast, so we can't track whether they continued listening to those podcasts.

If Spotify's investment in podcasts (both in specific shows and in overall infrastructure) resulted in significant user growth, we need to consider whether most of these users:

  1. Disproportionately listen to these most expensive/most popular podcasts?
  2. If so, did these new users later engage with other podcasts as well?
  3. Were the new users retained over a longer period, or was there a significant drop-off?
Did most of the growth go to the top 1% of podcasts?¶
Hypothesis I¶

Did Spotify's investment and overall strategy of focusing on a small number of creators prove effective? Specifically, did the growth rate in popularity of the most popular podcasts (defined as the top 1st percentile based on the number of reviews) exceed that of other podcasts? Based on this question, we formulate our first hypothesis:

H1: The number of reviews for the most popular podcasts is increasing at a faster rate than for the bottom 99% of all podcasts.

To test this hypothesis, follow these steps:

  1. Transform the reviews_by_month_count_df_after_2015 dataframe to show the monthly growth rate for the top 1% and bottom 99% of podcasts.
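
The transformation above could look roughly like this (a minimal sketch: the column names and toy numbers are assumptions, standing in for the real reviews_by_month_count_df_after_2015 dataframe):

```python
import pandas as pd

# Hypothetical monthly review counts; in the notebook this is
# reviews_by_month_count_df_after_2015 with one series per group.
df = pd.DataFrame(
    {
        "top_1pct": [1000, 1100, 1250, 1200, 1400],
        "bottom_99pct": [5000, 5100, 5150, 5300, 5350],
    },
    index=pd.period_range("2019-01", periods=5, freq="M"),
)

# Month-over-month growth rate for each group.
growth = df.pct_change().dropna()

# Summary statistics comparable to the mean/std figures below.
summary = growth.agg(["mean", "std"]).round(2)
print(summary)
```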

Growth rate for top 1%:
Mean: 0.08 (Std Dev: 0.44)

Growth rate for bottom 99%:
Mean: 0.02 (Std Dev: 0.14)

To decide the appropriate test, the following assumptions (required for a parametric t-test) should be considered:

  1. Data should be normally distributed or the sample size should be large.
  2. Variances of the two groups being compared should be equal.

If these assumptions do not hold, a non-parametric test like the Mann-Whitney U Test should be used.

Shapiro-Wilk Test for Normality:
Test Stat: 0.76
P-value: 0.0 (a p-value below 0.05 means we reject the assumption of normality).
This indicates that a non-parametric test should be used.

Levene Test for Homogeneity of Variances:
Test Stat: 40.53
P-value: 0.0
This also supports the decision to use a non-parametric test.

Mann-Whitney U Test:
U-value: 40.53
P-value: 0.0
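
The test sequence described above can be sketched with scipy (the synthetic samples here are assumptions standing in for the real growth-rate series):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Hypothetical growth-rate samples for the two groups.
top_1 = rng.lognormal(mean=-2.5, sigma=0.5, size=80)    # heavier right tail
bottom_99 = rng.normal(loc=0.02, scale=0.14, size=80)

# 1. Shapiro-Wilk: is the sample normally distributed?
shapiro_stat, shapiro_p = stats.shapiro(top_1)

# 2. Levene: do the two groups have equal variances?
levene_stat, levene_p = stats.levene(top_1, bottom_99)

# 3. If either assumption fails, fall back to the non-parametric
#    Mann-Whitney U test (one-sided: top 1% grows faster).
u_stat, u_p = stats.mannwhitneyu(top_1, bottom_99, alternative="greater")

print(f"Shapiro p={shapiro_p:.3g}, Levene p={levene_p:.3g}, MWU p={u_p:.3g}")
```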

The p-value indicates a significant difference in growth rates between the top 1% and the bottom 99%. This means we can reject the null hypothesis, which supports our initial hypothesis that the popularity of the top 1% most popular podcasts has grown at a significantly faster pace than the rest.

Comparison of Average Growth Rates:
Average growth for top 1%: 0.08
Average growth for the bottom 99%: 0.02

The growth rate for listeners of the top 1% of podcasts has been significantly higher than for the remaining podcasts. Combined with the rapidly accelerating growth after 2018, this would validate Spotify's approach. However, we must remain cautious:

  • The podcast sector itself has been growing rapidly across all platforms, especially during the Covid pandemic.
  • It's not clear whether this made financial sense: even if Spotify succeeded in growing its listener base, the cost per gained user may have been unsustainably high.

Distribution of Podcasts by Popularity¶

We'll further look into the distribution of podcasts by popularity to get a clearer picture of just how unequal the distribution of listeners is.

We'll use the Lorenz curve (commonly used in economics to measure wealth and income inequality, and tied to the Gini index) to visualize the disparity in podcast reviews, as it effectively highlights the degree of inequality.

Gini coefficient is: 0.93
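
The Gini coefficient can be computed directly from the raw review counts; a sketch using the standard formula based on ordered values (not necessarily the exact code used in the notebook):

```python
import numpy as np

def gini(values) -> float:
    """Gini coefficient of a 1-D array of non-negative counts."""
    v = np.sort(np.asarray(values, dtype=float))
    n = v.size
    # G = (2 * sum(i * v_i) / (n * sum(v))) - (n + 1) / n
    index = np.arange(1, n + 1)
    return 2.0 * np.sum(index * v) / (n * v.sum()) - (n + 1.0) / n

# Perfect equality -> 0; extreme concentration -> close to 1.
print(gini([1, 1, 1, 1]))      # 0.0
print(gini([0, 0, 0, 100]))    # 0.75
```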

Distribution of podcast reviews by percentile (e.g. the top 5% of all podcasts have 71.2% of all reviews):

  Percentile  Proportion (%)
      Top 1%            43.1
      Top 5%            71.2
     Top 10%            81.5
     Top 50%            98.1
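
The percentile shares in the table can be computed along these lines (a sketch with made-up counts; num_reviews stands in for the per-podcast review count column from the dataset):

```python
import numpy as np
import pandas as pd

# Hypothetical per-podcast review counts.
num_reviews = pd.Series([1, 2, 2, 3, 5, 8, 40, 120, 400, 2000])

def top_share(counts: pd.Series, pct: float) -> float:
    """Share of all reviews held by the top `pct` fraction of podcasts."""
    k = max(1, int(np.ceil(len(counts) * pct)))
    top = counts.sort_values(ascending=False).head(k)
    return 100.0 * top.sum() / counts.sum()

for pct in (0.01, 0.05, 0.10, 0.50):
    print(f"Top {pct:.0%}: {top_share(num_reviews, pct):.1f}%")
```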
User Engagement Analysis¶

In this part we'll look into user listening patterns and the effect of Spotify's recent investments on them. Specifically, we want to examine whether new users attracted to the platform were more likely to listen to the top 1% most popular podcasts than users who joined the platform earlier.

Distribution of review count by user:

mean: 1.34, median: 1.00, max: 614, stdev: 1.83
skewness*: 98.11, kurtosis**: 21137.37

*Skewness measures the direction and degree of asymmetry; a positive skew indicates that the tail is on the right side of the distribution. A value this high indicates extreme rightward skew: most values are clustered on the left, with a few extremely large values on the right.
**High kurtosis means more of the variance is the result of infrequent extreme deviations.
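
These summary statistics can be reproduced as follows (the tiny sample here is invented for illustration; the notebook computes them on the full per-user review counts):

```python
import pandas as pd
from scipy import stats

# Hypothetical per-user review counts: most users leave one review,
# a handful leave very many.
reviews_per_user = pd.Series([1] * 95 + [2, 3, 5, 40, 614])

print(f"mean:     {reviews_per_user.mean():.2f}")
print(f"median:   {reviews_per_user.median():.2f}")
print(f"stdev:    {reviews_per_user.std():.2f}")
print(f"skewness: {stats.skew(reviews_per_user):.2f}")
print(f"kurtosis: {stats.kurtosis(reviews_per_user):.2f}")
```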

We can see that the vast majority of users have left only a single review, while a small proportion have left a very large number. A few users have written hundreds of reviews, which seems somewhat suspicious, but since their number is very small it's not inconceivable that some users listened to hundreds of different podcasts over 4+ years.

The chart above tracks the number of new users (based on their first review) in a given quarter and shows whether a user's first review was for a top 1% podcast by popularity.

We can see that the proportion of new users whose first review was for one of the most popular podcasts has been increasing, in line with our previous findings.

Hypothesis II¶

H: There is a difference in the number of reviews left during the first 6 months on the platform between users who joined after 2020-01-01 and those who joined before.

Based on our prior analysis, we assume that a significantly higher proportion of users who joined the platform after 2020 were attracted by one of the newly acquired or highly popular podcasts. We want to check whether these users stayed on the platform and listened to other podcasts as much as users who joined earlier.

Null Hypothesis (H0): There is no difference in the number of reviews left during the first 6 months on the platform between users who joined after 2020-01-01 and those who joined earlier.

We'll again use the Mann-Whitney U Test to check this:

U statistic: 251977982392.5
P-value: 0.0
There is a significant difference in the number of reviews left during the first six months between the two groups.
Old Users - Mean: 1.13, Std Dev: 0.98, Count: 773917
After 2020 Users - Mean: 1.12, Std Dev: 0.99, Count: 648280
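
The comparison above can be sketched as follows (the geometric samples are assumptions standing in for the real first-six-months review counts; the notebook splits the groups on a 2020-01-01 first-review cutoff):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical reviews-per-user counts in the first six months:
# counts of at least 1, heavily concentrated at 1.
old_users = rng.geometric(p=0.88, size=5000)   # joined before 2020
new_users = rng.geometric(p=0.90, size=5000)   # joined after 2020

# Two-sided Mann-Whitney U test on the two distributions.
u_stat, p_value = stats.mannwhitneyu(old_users, new_users,
                                     alternative="two-sided")

print(f"U={u_stat:.0f}, p={p_value:.3g}")
print(f"old mean={old_users.mean():.2f}, new mean={new_users.mean():.2f}")
```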

The Mann-Whitney U test indicates a significant difference, with a p-value of 0.0: there is a statistically significant difference in the distributions of the two groups.

However, the means and standard deviations are almost identical. This is likely an effect of the very large sample sizes, and the statistical significance might not be practically meaningful.

If we visualize it, the difference is somewhat more discernible: users joining after 2020 have left more than one review slightly less often than older users. This would indicate that our hypothesis might be accurate and that new users are a bit less likely to explore the platform and listen to other podcasts. However, the effect size is very small and probably not practically significant.

Podcast Category/Genre Analysis¶

In this section we'll examine whether there is significant variance between review distribution based on podcast category/genre.

There seems to be some variance between categories.

Summary¶

Our analysis indicates that Spotify's strategy to invest heavily in a select portfolio of podcasts resulted in a skewed growth pattern: the top 1% of podcasts, likely including expensive exclusives, saw review counts—and by proxy, popularity—grow faster than the remaining 99%.

However, post-2021 data (including external sources) reveals a downturn, questioning the longevity of listener interest. While these expensive/popular shows initially captured disproportionate listener attention, whether this translated into a broader, sustained engagement across Spotify's podcast spectrum is unclear. The challenge ahead for Spotify is to leverage early gains from high-profile investments to cultivate a diverse, enduring podcast ecosystem.